AITopics

Country: Asia > China (0.46)

Industry: Information Technology > Security & Privacy (0.47)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.47)

Neural Information Processing Systems, Feb-17-2026, 21:40:56 GMT

A Proof of Proposition 2.2: additive expansion proposition

We first define the restricted Cheeger constant in the link prediction task. Then, according to Proposition 2.1, we have: [...] Then, we can draw the same conclusion with Eq. 12, and the [...] Thus, Eq. 16 can be simplified to: [...] "sites" [...] Based on Eq. 15 and Eq. 17, we can rewrite L [...] The inequality holds due to the assumption. Knowledge discovery: In the 5 random experiments, we add 500 pseudo links in each iteration. The metadata of the nodes is all strongly relevant to "Linux". [...] Both papers focus on "malware"/"phishing" under the topic "Computer security". The detailed result of the case study is shown in Table 6.

data mining, machine learning, proposition 2, (16 more...)

Country: Asia > Taiwan (0.05)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Software (0.96)
Information Technology > Artificial Intelligence > Machine Learning (0.90)
Information Technology > Data Science > Data Mining (0.57)

Neural Information Processing Systems, Feb-17-2026, 21:40:53 GMT

Deep Insights into Noisy Pseudo Labeling on Graph Data Botao Wang

Pseudo labeling (PL) is a widely applied strategy that enlarges the labeled dataset by self-annotating promising unlabeled samples during the training process.
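As a rough illustration of the general idea, here is a minimal sketch of confidence-thresholded pseudo labeling for a generic classifier. It is not the paper's graph-specific method: the function name `select_pseudo_labels`, the threshold value, and the list-of-probabilities input are all assumptions made for this example.

```python
def select_pseudo_labels(probabilities, threshold=0.9):
    """Pick unlabeled samples to self-annotate.

    probabilities: one list of class probabilities per unlabeled sample
                   (hypothetical model output, assumed for illustration).
    threshold:     minimum top-class probability required to accept a
                   pseudo label (an assumed hyperparameter).
    Returns a list of (sample_index, predicted_label) pairs.
    """
    selected = []
    for index, probs in enumerate(probabilities):
        # The pseudo label is the class the model finds most likely.
        label = max(range(len(probs)), key=probs.__getitem__)
        # Keep only confident predictions to limit label noise.
        if probs[label] >= threshold:
            selected.append((index, label))
    return selected


unlabeled_predictions = [[0.95, 0.05], [0.6, 0.4], [0.1, 0.9]]
pseudo_labeled = select_pseudo_labels(unlabeled_predictions, threshold=0.9)
```

The selected pairs would then be appended to the labeled set before the next training iteration; samples below the threshold stay unlabeled.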

machine learning, natural language, prediction, (17 more...)

Country: Asia > China (0.46)

Industry: Information Technology > Security & Privacy (0.47)

Technology:

Information Technology > Artificial Intelligence > Natural Language (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.47)

Neural Information Processing Systems, Feb-17-2026, 17:11:55 GMT

A Appendix

Usage Application: The model is intended for open-vocabulary object detection. Known Caveats

artificial intelligence, machine learning, social media, (17 more...)

Country: North America (0.04)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Communications > Social Media (0.69)

arXiv.org Artificial Intelligence, Nov-27-2025

From Bits to Rounds: Parallel Decoding with Exploration for Diffusion Language Models

Fu, Hengyu, Huang, Baihe, Adams, Virginia, Wang, Charles, Srinivasan, Venkat, Jiao, Jiantao

Diffusion Language Models (DLMs) have recently emerged as a strong alternative to autoregressive language models (LMs). DLMs offer comparable accuracy with faster inference speed via parallel decoding. However, standard DLM decoding strategies relying on high-confidence tokens encounter an inherent information-theoretic bottleneck that restricts decoding progress and ultimately slows generation. We demonstrate both theoretically and empirically that prioritizing high-confidence tokens is inherently inefficient. High-probability tokens carry negligible information and strictly relying on them limits the effective progress made in each decoding round. We prove that the number of decoding rounds must grow linearly with the sample's total information (negative log-likelihood) and inversely with the per-round information budget, establishing a bits-to-rounds principle. We also propose Explore-Then-Exploit (ETE), a training-free decoding strategy that maximizes information throughput and decoding efficiency. ETE combines cross-block decoding with targeted exploration of high-uncertainty tokens to reshape the conditional distribution and trigger cascades of confident predictions. Experiments verify our theoretical bounds and demonstrate that ETE consistently reduces the required number of decoding rounds compared to confidence-only baselines without compromising generation quality.
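The bits-to-rounds relationship can be made concrete with a small calculation. The sketch below assumes the simplest form consistent with the abstract, rounds >= total information / per-round budget; the exact statement and constants of the paper's bound are not given here, and the function name and the example probabilities are hypothetical.

```python
import math


def min_decoding_rounds(token_probs, bits_per_round):
    """Illustrative lower bound on the number of decoding rounds.

    token_probs:    probability the model assigns to each generated token
                    (assumed values, for illustration only).
    bits_per_round: information budget per round, in bits (assumed).
    Returns (total_bits, rounds), where total_bits is the sample's
    negative log-likelihood in bits.
    """
    if bits_per_round <= 0:
        raise ValueError("bits_per_round must be positive")
    # Total information of the sample: sum of -log2 p over its tokens.
    total_bits = sum(-math.log2(p) for p in token_probs)
    # Rounds scale linearly with total_bits and inversely with the budget.
    return total_bits, math.ceil(total_bits / bits_per_round)


# Eight tokens at probability 0.5 carry 1 bit each, 8 bits in total.
total_bits, rounds = min_decoding_rounds([0.5] * 8, bits_per_round=2.0)
```

Under these assumptions, tokens with probability close to 1 contribute almost nothing to `total_bits`, which is the sense in which committing only high-confidence tokens makes little progress per round.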

artificial intelligence, arxiv preprint arxiv, natural language, (16 more...)

2511.21103

Country: North America > United States (0.28)

Genre: Research Report (0.82)

Technology: Information Technology > Artificial Intelligence > Natural Language (1.00)

Apostu, Alexandru-Mihai, Preda, Andrei, Damir, Alexandra Daniela, Bolocan, Diana, Ionescu, Radu Tudor, Croitoru, Ioana, Gaman, Mihaela

AutoMalDesc: Large-Scale Script Analysis for Cyber Threat Research

arXiv.org Artificial Intelligence, Nov-18-2025

Generating thorough natural language explanations for threat detections remains an open problem in cybersecurity research, despite significant advances in automated malware detection systems. In this work, we present AutoMalDesc, an automated static analysis summarization framework that, following initial training on a small set of expert-curated examples, operates independently at scale. This approach leverages an iterative self-paced learning pipeline to progressively enhance output quality through synthetic data generation and validation cycles, eliminating the need for extensive manual data annotation. Evaluation across 3,600 diverse samples in five scripting languages demonstrates statistically significant improvements between iterations, showing consistent gains in both summary quality and classification accuracy. Our comprehensive validation approach combines quantitative metrics based on established malware labels with qualitative assessment from both human experts and LLM-based judges, confirming both technical precision and linguistic coherence of generated summaries. To facilitate reproducibility and advance research in this domain, we publish our complete dataset of more than 100K script samples, including annotated seed (0.9K) and test (3.6K)

large language model, machine learning, natural language, (17 more...)

2511.13333

Genre: Research Report (1.00)

Industry:

Information Technology > Security & Privacy (1.00)
Government > Military > Cyberwarfare (0.34)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.48)

Linna, Eljas, Baltakys, Kestutis, Iosifidis, Alexandros, Kanniainen, Juho

LOBERT: Generative AI Foundation Model for Limit Order Book Messages

arXiv.org Artificial Intelligence, Nov-18-2025

Modeling the dynamics of financial Limit Order Books (LOB) at the message level is challenging due to irregular event timing, rapid regime shifts, and the reactions of high-frequency traders to visible order flow. Previous LOB models require cumbersome data representations and lack adaptability outside their original tasks, leading us to introduce LOBERT, a general-purpose encoder-only foundation model for LOB data suitable for downstream fine-tuning. LOBERT adapts the original BERT architecture for LOB data by using a novel tokenization scheme that treats complete multi-dimensional messages as single tokens while retaining continuous representations of price, volume, and time. With these methods, LOBERT achieves leading performance in tasks such as predicting mid-price movements and next messages, while reducing the required context length compared to previous methods.

lobert, machine learning, natural language, (18 more...)

2511.12563

Genre: Research Report (0.40)

Industry: Banking & Finance > Trading (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.41)

Feifel, Patrick, Franke, Benedikt, Bonarens, Frank, Köster, Frank, Raulf, Arne, Schwenker, Friedhelm

Revisiting Evaluation of Deep Neural Networks for Pedestrian Detection

arXiv.org Artificial Intelligence, Nov-14-2025

The reliable DNN-based perception of pedestrians represents a crucial step towards automated driving systems. Currently applied metrics for a subset-based evaluation prohibit an application-oriented performance evaluation of DNNs for pedestrian detection. We argue that the current limitation in evaluation can be mitigated by the use of image segmentation. In this work, we leverage the instance and semantic segmentation of Cityscapes to describe a rule-based categorization of potential detection errors for CityPersons. Based on our systematic categorization, the filtered log-average miss rate as a new performance metric for pedestrian detection is introduced. Additionally, we derive and analyze a meaningful upper bound for the confidence threshold. We train and evaluate four backbones as part of a generic pedestrian detector and achieve state-of-the-art performance on CityPersons by using a rather simple architecture. Our results and comprehensible analysis show benefits of the newly proposed performance metrics.
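For context, the conventional (unfiltered) log-average miss rate used in pedestrian detection benchmarks can be sketched as below. This is not the filtered variant the abstract introduces: it assumes the common convention of nine reference points log-spaced over false positives per image (FPPI) in [0.01, 1], and a miss rate of 1.0 where the curve has no point at or below a reference. The function name and input format are hypothetical.

```python
import bisect
import math


def log_average_miss_rate(fppi, miss_rate, num_points=9):
    """Geometric mean of miss rates sampled at log-spaced FPPI references.

    fppi:      false positives per image, sorted in ascending order.
    miss_rate: miss rate at each corresponding fppi value.
    """
    # Reference FPPI values evenly spaced in log space from 1e-2 to 1e0.
    refs = [10 ** (-2 + 2 * i / (num_points - 1)) for i in range(num_points)]
    log_values = []
    for ref in refs:
        # Index of the last curve point whose FPPI does not exceed ref.
        idx = bisect.bisect_right(fppi, ref) - 1
        # Assumed convention: no such point means every pedestrian is missed.
        sampled = miss_rate[idx] if idx >= 0 else 1.0
        # Clamp to avoid log(0) when the sampled miss rate is exactly zero.
        log_values.append(math.log(max(sampled, 1e-10)))
    return math.exp(sum(log_values) / len(log_values))


# A detector with a flat miss rate of 0.2 across the whole FPPI range.
lamr = log_average_miss_rate([0.001, 0.1, 1.0], [0.2, 0.2, 0.2])
```

Averaging in log space makes the metric sensitive to relative changes at low miss rates, which is why it is reported instead of a plain arithmetic mean.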

artificial intelligence, detection, machine learning, (14 more...)